GENEVAL: A Proposal for Shared-task Evaluation in NLG
Abstract
We propose to organise a series of shared-task NLG events in which participants are asked to build systems with similar input/output functionality, and these systems are evaluated with a range of different evaluation techniques. The main purpose of these events is to allow us to compare the evaluation techniques themselves, by correlating the results of the different evaluations across the systems entered in the events.
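As a rough illustration of the core idea (not part of the original proposal): given per-system scores from two evaluation techniques applied to the same set of systems, their agreement can be measured with a rank correlation. A minimal sketch follows; the system scores are invented for the example, and only scipy.stats.spearmanr is assumed.

```python
# Minimal sketch: correlating two evaluation techniques over the same systems.
# The scores below are hypothetical placeholders; an actual shared-task event
# would substitute the results produced by its own evaluation regimes.
from scipy.stats import spearmanr

# Hypothetical per-system results, e.g. human ratings vs. an automatic metric,
# with one value per participating NLG system (same order in both lists).
human_ratings = [4.2, 3.8, 2.9, 4.5, 3.1]
automatic_scores = [0.61, 0.55, 0.40, 0.66, 0.47]

rho, p_value = spearmanr(human_ratings, automatic_scores)
print(f"Spearman rank correlation: {rho:.2f} (p = {p_value:.3f})")
```

A high rank correlation would suggest that the cheaper technique ranks systems much as the more expensive one does, which is the kind of evidence such events are meant to gather.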
Similar Resources
Pragmatic Influences on Sentence Planning and Surface Realization: Implications for Evaluation
Three questions to ask of a proposal for a shared evaluation task are: whether to evaluate, what to evaluate and how to evaluate. For NLG, shared evaluation resources could be a very positive development. In this statement I address two issues related to the what and how of evaluation: establishing a “big picture” evaluation framework, and evaluating generation in context.
A Repository of Data and Evaluation Resources for Natural Language Generation
Starting in 2007, the field of natural language generation (NLG) has organised shared-task evaluation events every year, under the Generation Challenges umbrella. In the course of these shared tasks, a wealth of data has been created, along with associated task definitions and evaluation regimes. In other contexts too, sharable NLG data is now being created. In this paper, we describe the onlin...
Automatic Evaluation of Referring Expression Generation Is Possible
Shared evaluation metrics and tasks are now well established in many fields of Natural Language Processing. However, the Natural Language Generation (NLG) community is still lacking common methods for assessing and comparing the quality of systems. A number of issues that complicate automatic evaluation of NLG systems have been discussed in the literature. The most fundamental observation in ...
Introducing Shared Task Evaluation to NLG: The TUNA Shared Task Evaluation Challenges
Shared Task Evaluation Challenges (STECs) have only recently begun in the field of NLG. The TUNA STECs, which focused on Referring Expression Generation (REG), have been part of this development since its inception. This chapter looks back on the experience of organising the three TUNA Challenges, which came to an end in 2009. While we discuss the role of the STECs in yielding a substantial bod...
Evaluation in Natural Language Generation: The Question Generation Task
Question Generation (QG) is proposed as a shared-task evaluation campaign for evaluating Natural Language Generation (NLG) research. QG is a subclass of NLG that plays an important role in learning environments, information seeking, and other applications. We describe a possible evaluation framework for standardized evaluation of QG that can be used for black-box evaluation, for finer-grained e...
Publication year: 2006